IN THE UNITED STATES PATENT AND TRADEMARK OFFICE

10/758,368 
Simon C. Steely Jr. 
January 15, 2004 

SYSTEM AND METHOD FOR UPDATING OWNER 
PREDICTORS 

2185 

Midys Rojas 
200313752-1 
022879 

Commissioner of Patents 
P.O. Box 1450 

Alexandria, Virginia 22313-1450 

DECLARATION OF SIMON C. STEELY JR. PURSUANT TO 37 C.F.R. § 1.131
Dear Sir: 

1. I am a co-inventor of the apparatus disclosed in U.S. Patent Application No.
10/758,368 (the "'368 application").

2. I conceived the invention presently claimed in the '368 application with
co-inventor Gregory Tierney prior to January 13, 2004, as evidenced by the invention disclosure
describing the claimed invention that we submitted on June 13, 2003 to Hewlett Packard's legal
department (see Exhibit A).

3. Indeed, we conceived the claimed invention prior to June 13, 2003. 

4. The declarant further states that the above statements were made with the
knowledge that willful false statements and the like are punishable by fine and/or imprisonment,
or both, under Section 1001 of Title 18 of the United States Code, and that any such willful false
statement may jeopardize the validity of this application or any patent resulting therefrom.



Date: October 5, 2009

Simon C Steely Jr. 



WAS: 155377.1 
INVENTION DISCLOSURE



DATE RCVD: 06/13/2003 
PDNO: 200313752 



ATTORNEY: LPG 



All IPG inventors and other inventors who have access to Disclose should submit their disclosures
through that application. The URL is

<https://wkrpwebl.cv.hp.com/dbi?application=disclose>. Invention Disclosures submitted here
by inventors who have access to Disclose will not be processed at all.

Instructions: The information contained in this document is HP CONFIDENTIAL and may not be disclosed to others 
without prior authorization. Submit this disclosure to the HP Legal Department as soon as possible. No patent protection is 
possible until a patent application is authorized, prepared, and submitted to the Government. 

Red text indicates a required field.



Descriptive Title of invention: 

Owner Prediction with Processor-side Directory Caches in a Distributed cc-NUMA system. 



Name of Project: Windjammer 



Product Name or Number: 



Submitter Location (City): Other 



Was a description of the invention published, or are you planning to publish? If so, the date(s) and publication(s): 



No 



Was a product or prototype including the invention (i) announced, offered for sale, or sold to any third party (for example, 
customer, supplier, contract manufacturer), or (ii) sold to HP by, for example, a supplier or contract manufacturer, or (iii) is 
such activity proposed? If so, when and to whom?: 



No 



Was the invention disclosed to anyone outside of HP, or will such disclosure occur? If so, the date(s) and name(s): 



No 



If any of the above situations will occur within 3 months, call your IP Attorney or the Legal Department now at 1-898-4919 
or 970-898-4919 



Was the invention described in a lab book or other record? If so, please identify (lab book #, etc.) 



No 



Was the invention built or tested? If so, the date: 



No 



Was this invention made under a government contract? If so, the agency and contract number: 



No 



Description of Invention: Please describe your invention in detail using the following outline. 

A. Prior solutions and their disadvantages (attach copies of any pertinent product literature, technical articles, patents, etc.). 

B. Problems solved by the invention. 

C. Advantages of the invention over what has been done before. 

D. Description of the construction and operation of the invention. 

(include appropriate schematic, block & timing diagrams, drawings, samples, graphs, flowcharts, computer listings, etc.). 
Electronic Attachment 



List any pertinent patents material to the invention. 



List any articles or references or devices pertinent to the invention. 



Identify Inventor(s): Pursuant to my (our) employment agreement, I (we) submit this disclosure on this date: 06/13/2003 



Employee No. 4719590 



Name: Simon C. Steely, Jr.



Telnet: 508-467-4631 



Mailstop: MR01-1/P5



Entity & Lab Name: 
HPSL - MRO section 



Employee No. 



Name: Gregory Edward Tierney



Telnet: 508-467-4499



Mailstop: MR01-1/P5



Entity & Lab Name:
HPSL - MRO section


Employee No. 


Name: 




Telnet: 


Mailstop: 


Entity & Lab Name: 


Identify Witness(es): (It's best to identify the person(s) to whom invention was first disclosed)


The invention was first explained to, and understood by, the witness(es) on this date: May, 2003


Name: Steve Van Doren


Employee No. 


Telnet: 


Mailstop: 


Entity: 


Name: 


Employee No. 


Telnet: 


Mailstop: 


Entity: 


Inventor & Home Address Information: 


First Inventor's Full Name: Simon C. Steely, Jr. 




Citizenship: U.S.A.


Street 8 Anna Louise Dr 


City Hudson 


State NH 




Zip 03051 




Do you have a Residential P.O. 

Address? 

No 


Description 


Second Inventor's Full Name: Gregory Edward Tierney 




Citizenship: U.S.A. 


Street 161 Boston Rd 


City Chelmsford 


State MA 


Zip 01824 


Do you have a Residential P.O. 
Address? 


Description 


Third Inventor's Full Name: 


Citizenship:


Street 


City 


State 






Zip 




Do you have a Residential P.O. 
Address? 


Description 



Hardcopy Files: 




owner prediction.zip 
Owner Prediction with Processor-side Directory Caches
in a Distributed cc-NUMA system.

The Invention 

See co-pending invention disclosure entitled, "Owner-Speculative Scalar Protocol for
Distributed cc-NUMA" for an example of a protocol that supports owner prediction. See
the paper, "Owner Prediction for Accelerating Cache-to-Cache Transfer Misses in a
cc-NUMA Architecture", by Manuel E. Acacio, et al., for an example of another approach.

Owner prediction is the process of guessing the third-party cache location of the coherent
version of the data. A system using owner prediction will send a request packet directly
from the requesting node to the third-party cache and return the data in 2 hops. A
protocol that sends its requests only to the home node would require 3 to 4 hops to
retrieve data from a third-party cache.
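
The hop counts above can be illustrated with a minimal sketch. The node names and the reply_via_home flag are hypothetical; in particular, treating the 4-hop case as "data returns through the home node" is an assumption made for illustration, not something this disclosure specifies.

```python
def path_with_owner_prediction(requester, predicted_owner):
    """Request goes straight to the predicted owner, which replies with the data."""
    return [(requester, predicted_owner), (predicted_owner, requester)]


def path_via_home(requester, home, owner, reply_via_home=False):
    """Request goes to the home node, which forwards it to the owning cache.

    reply_via_home is a hypothetical knob: False models the owner replying
    directly to the requester (3 hops); True models the data returning
    through the home node (4 hops).
    """
    path = [(requester, home), (home, owner)]
    if reply_via_home:
        path += [(owner, home), (home, requester)]
    else:
        path.append((owner, requester))
    return path


predicted = path_with_owner_prediction("P0", "P2")
forwarded = path_via_home("P0", "H1", "P2")
relayed = path_via_home("P0", "H1", "P2", reply_via_home=True)
print(len(predicted), len(forwarded), len(relayed))  # prints: 2 3 4
```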

This invention is an approach to performing owner prediction that works well. We've 
done performance simulations for this approach as part of our Windjammer investigation 
work. See file procside_cacheing.ppt included with this disclosure. 

A Directory Cache is a structure that is in front of each directory and remembers the 
recent entries accessed in the directory. The structure is built out of cache technology 
that allows it to be accessed much faster than accesses to the actual directory. This 
invention is concerned only with the directory accesses that change the owner of the line. 
Our simulations have shown that modest sized directory-caches have high hit rates on 
blocks that are owned by a remote cache. 

This invention places structures that behave similarly to a directory cache at each processor
node. The structure is organized to identify blocks that are owned by a third-party cache.
The contents of the structure are manipulated with system commands from the directory
controller. For each directory access that results in a change to the ownership, a message
containing the address and new owner of the line is broadcast to each of the
owner-prediction structures.
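
A minimal sketch of this update path follows. The class names, the fixed capacity, the replacement of the least recently updated entry, and the use of None to mean "no longer owned by a cache" are all assumptions made for illustration; the disclosure does not specify them.

```python
from collections import OrderedDict


class OwnerPredictor:
    """Processor-side table mapping block address -> third-party owner node."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def on_ownership_update(self, address, new_owner):
        """Apply one broadcast update sent by a directory controller."""
        if new_owner is None:
            # Assumption: None means the block is no longer owned by a cache.
            self.entries.pop(address, None)
            return
        self.entries[address] = new_owner
        self.entries.move_to_end(address)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently updated

    def predict(self, address):
        """Return the predicted owner, or None if the block is not tracked."""
        return self.entries.get(address)


class Directory:
    """Toy directory that broadcasts every ownership change."""

    def __init__(self, predictors):
        self.owner = {}
        self.predictors = predictors

    def set_owner(self, address, new_owner):
        if self.owner.get(address) == new_owner:
            return  # ownership did not change, so nothing is broadcast
        if new_owner is None:
            self.owner.pop(address, None)
        else:
            self.owner[address] = new_owner
        for predictor in self.predictors:
            predictor.on_ownership_update(address, new_owner)


predictors = [OwnerPredictor(capacity=2) for _ in range(3)]
directory = Directory(predictors)
directory.set_owner(0x40, 1)
first_prediction = [p.predict(0x40) for p in predictors]
directory.set_owner(0x80, 2)
directory.set_owner(0xC0, 0)  # third tracked block evicts 0x40 at capacity 2
```

In this sketch every predictor sees the same update stream, so all tables agree once the broadcasts have been delivered.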

When a miss leaves the processor, it is checked against the owner-prediction structure. If a
match is found, the request proceeds to obtain the data from the third-party owner. Note that
under the OSSP protocol (described in the co-pending disclosure), request messages are sent to
both the third-party cache and the home directory in parallel. The protocol does not require that
the predicted owner actually be able to service the request. Thus, the owner-prediction
structure need not be meticulously maintained to avoid race conditions that may develop
while ownership state changes.
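
The miss path can be sketched as follows, assuming a plain dictionary stands in for the owner-prediction structure. The function name and the rule for skipping duplicate or self-directed requests are hypothetical. Because the home directory always receives the request, a stale prediction costs an extra message but does not affect correctness in this model.

```python
def destinations_for_miss(address, predicted_owners, home_node, requester):
    """Return the nodes that receive a request message for this miss.

    predicted_owners: dict of block address -> predicted owner node.
    The home directory is always included; the predicted third-party
    owner is added when there is a usable prediction.
    """
    targets = [home_node]
    owner = predicted_owners.get(address)
    # Assumption: never send a speculative request to ourselves or send a
    # second copy to a node that is already a destination.
    if owner is not None and owner != requester and owner not in targets:
        targets.append(owner)
    return targets


table = {0x40: "P2"}
hit_targets = destinations_for_miss(0x40, table, home_node="P1", requester="P0")
miss_targets = destinations_for_miss(0x80, table, home_node="P1", requester="P0")
print(hit_targets, miss_targets)  # prints: ['P1', 'P2'] ['P1']
```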



An alternative embodiment could be included to work with co-pending disclosure
entitled, "Mechanism Using Coherent Signal to Validate Eager Reply from Shared Copy
Located in Neighbor Cache" to recognize and broadcast sharing update information to
appropriate close neighbors as well. This would make the identification of shared copies
highly accurate as well.

Another alternative embodiment would be to combine this processor-side, cache-like
owner predictor with a conventional pattern-based owner predictor. The processor-side
predictor, updated on directory changes, works well when the ownership change was timely
(happened long enough ago to allow the network latency of the update to be buried).
However, when the ownership change has happened very recently (the miss occurs in the
shadow of an update to the processor-side directory-cache predictor), the pattern-based
predictor can help out by identifying some target nodes to guess on where the data
is. A pattern-based predictor could also be used to better manage the contents of the
processor-side predictor, for example by assisting the replacement policy of the
owner-cache to improve the hit rate of misses that are most critical to the processor.
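
One possible arrangement of the combined predictor is sketched below. It assumes the pattern-based predictor is consulted only when the processor-side structure has no entry for the address; the function name, the pattern_guess callable, and the max_guesses limit are hypothetical and not taken from this disclosure.

```python
def choose_speculative_targets(address, owner_table, pattern_guess, max_guesses=2):
    """Pick the nodes to speculate on for a miss.

    owner_table:   dict of address -> owner, filled by directory updates.
    pattern_guess: hypothetical callable returning candidate nodes for an
                   address, standing in for a pattern-based predictor.
    """
    owner = owner_table.get(address)
    if owner is not None:
        return [owner]  # a directory update names a single owner
    # No update has arrived for this block: fall back to pattern guesses.
    return list(pattern_guess(address))[:max_guesses]


owner_table = {0x40: "P2"}
recent_suppliers = lambda address: ["P3", "P1", "P2"]  # stand-in pattern predictor

print(choose_speculative_targets(0x40, owner_table, recent_suppliers))  # prints: ['P2']
print(choose_speculative_targets(0x80, owner_table, recent_suppliers))  # prints: ['P3', 'P1']
```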

Advantages 

This prediction approach is more precise than the pattern-matching strategies of other
known approaches, because its use of system update messages provides the true ownership
state of a line. This yields the following advantages:

• Able to record only a single owner, whereas less precise approaches often require 
several parallel guesses. Our predictor is thus able to hold more blocks in the 
same amount of storage space than is required by an approach that records 
multiple ownership targets. 

• Only need to store state for those blocks that may be cached at a third-party. If an 
address is not found in the predictor, it can assume that the data can be found in 
memory and that speculation is not beneficial. 

• State updates are more timely. Most cache-to-cache transfers occur after a block
has been written, which is often identified in our system with an ownership
change. In contrast, pattern-based predictors are trained by the prior miss
and/or prior probe. Timely updates increase the likelihood that a third-party
target can be accurately identified by our predictor.
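
The storage argument in the first bullet can be made concrete with a little arithmetic. The figures below (an 8 KiB table, 30-bit tags, 64 nodes, three targets per entry for the less precise scheme) are hypothetical and chosen only to illustrate the comparison.

```python
def entries_that_fit(storage_bits, tag_bits, node_count, targets_per_entry):
    """Number of predictor entries that fit in a fixed storage budget.

    Each entry is modeled as an address tag plus one node identifier per
    recorded target; valid bits and other metadata are ignored.
    """
    node_id_bits = max(1, (node_count - 1).bit_length())
    entry_bits = tag_bits + targets_per_entry * node_id_bits
    return storage_bits // entry_bits


STORAGE_BITS = 8 * 1024 * 8  # hypothetical 8 KiB table
single_owner = entries_that_fit(STORAGE_BITS, tag_bits=30, node_count=64, targets_per_entry=1)
three_targets = entries_that_fit(STORAGE_BITS, tag_bits=30, node_count=64, targets_per_entry=3)
print(single_owner, three_targets)  # prints: 1820 1365
```

Under these assumed sizes, recording a single owner lets the same storage hold roughly a third more blocks than recording three targets per block.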

Problems Solved 

Identifying the location of remotely cached data reduces the effective latency of accesses 
to the system. 

Technology trends are such that link bandwidth between nodes in a distributed system 
will be higher than the bandwidths of the nodes that are being interconnected. This 
means that protocols that broadcast information will be more likely to work well since 
they won't suffer the queueing delays encountered in present networks. 



Broadcast of predictor table information from directories to nodes will be less of a 
handicap as technology improves. The subsequent reduction in effective system latency 
will result in increased system performance. 

Disadvantages of the Past 

Earlier approaches, such as the ownership prediction scheme in the Acacio paper, rely on
information from fills and other responses to train the ownership prediction structure.
This works only if there is a pattern to the processor that owns the data. The prediction
solution here works even without patterns in the owner of the data.


